Anomaly Detection in Log Files Using Selected Natural Language Processing Methods

Abstract

In this article, we address the problem of detecting anomalies in system log files. Computer systems generate huge numbers of events, which are noted in event log files. While most of them report normal actions, an unusual entry may inform about a failure or malware infection. A human operator may easily miss such an entry; therefore, anomaly detection methods are used for this purpose. In our work, we used an approach known from the natural language processing (NLP) domain, which operates on so-called embeddings, that is, vector representations of words or phrases. We describe an improved version of the LogEvent2Vec algorithm, proposed in 2020. In contrast to the original version, we propose a significant shortening of the analysis window, which both increased the accuracy of anomaly detection and made further analysis of suspicious sequences much easier. We experimented with various binary classifiers, such as decision trees or multilayer perceptrons (MLPs), and the Blue Gene/L dataset. We showed that selecting an optimal classifier (in this case, MLP) and a short log sequence gave very good results. The improved version of the algorithm yielded the best F1-score of 0.997, compared to 0.886 in the original version of the algorithm.
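The ideas in the abstract (short analysis windows over event embeddings, evaluated by F1-score) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the toy two-dimensional embeddings, and the choice to average event vectors within a window are hypothetical and are not taken from the paper. The classifier itself (an MLP in the paper) is left out.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def sliding_windows(events: Sequence[str], size: int) -> List[List[str]]:
    """Split an event-ID sequence into overlapping windows of `size` events."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [list(events[i:i + size]) for i in range(len(events) - size + 1)]


def window_vector(window: Sequence[str], embeddings: Dict[str, Vector]) -> Vector:
    """Average the event vectors in one window.

    Averaging is an assumed aggregation chosen for this sketch; the
    abstract does not state how event vectors are combined.
    """
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    for event in window:
        for k, value in enumerate(embeddings[event]):
            total[k] += value
    return [t / len(window) for t in total]


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 = harmonic mean of precision and recall for the positive class (1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy embeddings for three hypothetical event templates.
    toy_embeddings = {"E1": [1.0, 0.0], "E2": [0.0, 1.0], "E3": [1.0, 1.0]}
    log = ["E1", "E2", "E3", "E1"]
    for window in sliding_windows(log, size=2):
        print(window, window_vector(window, toy_embeddings))
    # Labels: 1 = anomalous window, 0 = normal window (made-up values).
    print(f1_score([1, 0, 1, 1], [1, 0, 0, 1]))
```

In a fuller pipeline, each window vector would be passed to a binary classifier, and a shorter window would narrow down which events an operator has to inspect when a window is flagged.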

Similar resources

Anomaly Detection from Log Files Using Data Mining Techniques

Log files are created by devices or systems in order to provide information about processes or actions that were performed. Detailed inspection of security logs can reveal potential security breaches and it can show us system weaknesses. In our work we propose a novel anomaly-based detection approach based on data mining techniques for log analysis. Our approach uses Apache Hadoop technique to ...

Computer Log Anomaly Detection Using Frequent Episodes

In this paper, we propose a set of algorithms to automate the detection of anomalous frequent episodes. The algorithms make use of the hierarchy and frequency of episodes present in an examined sequence of log data and in a history preceding it. The algorithms identify changes in a set of frequent episodes and their frequencies. We evaluate the algorithms and describe tests made using live comp...

Log File Anomaly Detection

Analysis of log files pertaining to a failed run can be a tedious task, especially if the file runs into thousands of lines. Using the recent development in text analysis using deep neural networks, we present a method to reduce effort needed to analyze the log file by highlighting the most probably useful text in the failed log file, which can assist in debugging the causes of the failure. In ...

Anomaly Detection in Log Records

Received Jan 2, 2018; revised Mar 9, 2018; accepted Mar 24, 2018. In recent times, complex software systems continuously generate application and server logs for events that occurred in the past. These generated logs can be utilized for anomaly and intrusion detection. These log files can be used for detecting certain types of abnormalities or exceptions such as spikes in HTTP reques...

Data Discovery and Anomaly Detection Using Atypicality: Signal Processing Methods

The aim of atypicality is to extract small, rare, unusual and interesting pieces out of big data. This complements statistics about typical data to give insight into data. In order to find such “interesting” parts of data, universal approaches are required, since it is not known in advance what we are looking for. We therefore base the atypicality criterion on codelength. In a prior paper we de...

Journal

Journal title: Applied Sciences

Year: 2022

ISSN: 2076-3417

DOI: https://doi.org/10.3390/app12105089